A HS-PRP-Type Hybrid Conjugate Gradient Method with Sufficient Descent Property

In this paper, based on the HS method and a modified version of the PRP method, a hybrid conjugate gradient (CG) method is proposed for solving large-scale unconstrained optimization problems. The CG parameter generated by the method is always nonnegative. Moreover, the search direction possesses the sufficient descent property independent of line search. Utilizing the standard Wolfe–Powell line search rule to yield the stepsize, the global convergence of the proposed method is shown under the common assumptions. Finally, numerical results show that the proposed method is promising compared with two existing methods.


Introduction
Consider the problem of minimizing $f$ over $\mathbb{R}^n$:

$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$

where $f: \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. Throughout, the gradient of $f$ at $x$ is denoted by $g(x)$, i.e., $g(x) := \nabla f(x)$. Conjugate gradient (CG) methods are popular and effective for solving unconstrained optimization problems of the form (1), especially in the large-scale case, owing to their simplicity and low memory requirements. These features have promoted their application in areas such as image deblurring and denoising, neural networks, and compressed sensing. We refer interested readers to the recent works [1][2][3] and the references therein for more details. The numerical results reported in [1] indicate that the CG method has great potential in solving image restoration problems.
Generally, the iterative formula of the CG method for solving problem (1) reads

$$x_{k+1} = x_k + \alpha_k d_k, \tag{2}$$

where $\alpha_k > 0$ is the stepsize computed by some line search. Here, $d_k$ is the search direction, which is defined as follows:

$$d_k = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k d_{k-1}, & k \ge 2, \end{cases} \tag{3}$$

where $\beta_k \in \mathbb{R}$ is the so-called CG parameter and $g_k$ abbreviates $g(x_k)$, i.e., $g_k := g(x_k)$. The two key factors that affect the numerical performance of the CG method are the stepsize and the CG parameter. First, we outline several well-known line search criteria in the literature.
(a) The exact line search rule: calculate a stepsize $\alpha_k$ satisfying

$$f(x_k + \alpha_k d_k) = \min_{\alpha > 0} f(x_k + \alpha d_k). \tag{4}$$

(b) The weak Wolfe-Powell (WWP) line search rule: calculate a stepsize $\alpha_k$ satisfying

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k \tag{5}$$

and

$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \tag{6}$$

where $0 < \delta < \sigma < 1$.

(c) The strong Wolfe-Powell (SWP) line search rule: calculate a stepsize $\alpha_k$ satisfying (5) and

$$|g(x_k + \alpha_k d_k)^T d_k| \le \sigma |g_k^T d_k|.$$

On the other hand, different CG methods are determined by different CG parameters. The well-known CG methods include the Fletcher-Reeves (FR) [4], Polak-Ribière-Polyak (PRP) [5,6], Hestenes-Stiefel (HS) [7], Liu-Storey (LS) [8], Fletcher's conjugate descent (CD) [9], and Dai-Yuan (DY) [10] methods, and their CG parameters $\beta_k$ are, respectively, given by

$$\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \quad \beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}},$$

$$\beta_k^{LS} = \frac{g_k^T y_{k-1}}{-d_{k-1}^T g_{k-1}}, \quad \beta_k^{CD} = \frac{\|g_k\|^2}{-d_{k-1}^T g_{k-1}}, \quad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}},$$

where $y_{k-1} := g_k - g_{k-1}$ and $\|\cdot\|$ stands for the Euclidean norm. The methods yielded by the above CG parameters are called the classical CG methods, and their convergence analysis and numerical performance have been extensively studied (see, e.g., [4][5][6][7][8][9][10][11][12]). It has been shown that the above formulas for the CG parameters are equivalent when $f(x)$ is a convex quadratic and the stepsize $\alpha_k$ is obtained by the exact line search rule (4). For general functions, however, the numerical performance strongly depends on the choice of $\beta_k$. The FR, CD, and DY methods possess good convergence properties, but their numerical performance is somewhat unsatisfactory for general unconstrained nonlinear optimization problems [12][13][14]. In contrast, the convergence properties of the PRP, HS, and LS methods are not as good, but these methods often perform better in computation [12][13][14]. Therefore, in the past few decades, many formulas for $\beta_k$ have been designed on the basis of the above ones, with the aim of obtaining CG methods that possess both good global convergence properties and promising numerical performance (see [12][13][14][15][16] and references therein).
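The six classical parameters and their equivalence on a convex quadratic can be checked numerically. The following is a minimal sketch in plain Python, assuming the standard textbook forms of the FR, PRP, HS, LS, CD, and DY parameters; the function names `classical_betas` and `quadratic_demo` and the small test quadratic are illustrative choices, not taken from the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def classical_betas(g_new, g_old, d_old):
    """Return the six classical CG parameters (standard textbook forms)."""
    y = [a - b for a, b in zip(g_new, g_old)]  # y_{k-1} = g_k - g_{k-1}
    gg_new = dot(g_new, g_new)                 # ||g_k||^2
    gg_old = dot(g_old, g_old)                 # ||g_{k-1}||^2
    gy = dot(g_new, y)                         # g_k^T y_{k-1}
    dy = dot(d_old, y)                         # d_{k-1}^T y_{k-1}
    dg = dot(d_old, g_old)                     # d_{k-1}^T g_{k-1}
    return {
        "FR": gg_new / gg_old,
        "PRP": gy / gg_old,
        "HS": gy / dy,
        "LS": gy / (-dg),
        "CD": gg_new / (-dg),
        "DY": gg_new / dy,
    }


def quadratic_demo():
    """One exact line search step on f(x) = 0.5 * (x1^2 + 4 * x2^2).

    The quadratic and the starting point are hypothetical test data.
    """
    diag = [1.0, 4.0]                          # diagonal Hessian
    x = [1.0, 1.0]
    g_old = [a * xi for a, xi in zip(diag, x)]
    d_old = [-gi for gi in g_old]              # first direction: steepest descent
    # Exact minimizer along d_old for a quadratic: -g^T d / (d^T A d).
    alpha = -dot(g_old, d_old) / sum(a * di * di for a, di in zip(diag, d_old))
    x = [xi + alpha * di for xi, di in zip(x, d_old)]
    g_new = [a * xi for a, xi in zip(diag, x)]
    return classical_betas(g_new, g_old, d_old)


betas = quadratic_demo()
```

With an exact step from a steepest descent direction, $g_k^T d_{k-1} = 0$ and $g_k^T g_{k-1} = 0$, so all six entries of `betas` agree up to rounding. With an inexact line search on a general function they would typically differ.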
To our knowledge, the first hybrid CG method in the literature was proposed by Touati-Ahmed and Storey [17] (TS method), where $\beta_k$ is computed as

$$\beta_k^{TS} = \begin{cases} \beta_k^{PRP}, & \text{if } 0 \le \beta_k^{PRP} \le \beta_k^{FR}, \\ \beta_k^{FR}, & \text{otherwise}. \end{cases}$$

The TS method inherits some good properties of the FR and PRP methods since $\beta_k^{TS}$ is a hybrid of $\beta_k^{FR}$ and $\beta_k^{PRP}$. Combining the HS and DY methods, Dai and Yuan [18] proposed another hybrid CG method (hHD method), in which the hybrid CG parameter $\beta_k$ is obtained by

$$\beta_k^{hHD} = \max\{0, \min\{\beta_k^{HS}, \beta_k^{DY}\}\}.$$

When the WWP line search rule is used to compute the stepsize, the resulting search direction in [18] is a descent direction, and the global convergence of the hHD method is proved. Moreover, the numerical experiments reported in [18] illustrate that the hHD method is competitive and practicable. For other closely related works, we refer the readers to [18,19] and the references therein. It is worth noting that the CG parameters $\beta_k$ defined in [17][18][19] are restricted to nonnegative values. As explained in [19], this restriction in turn results in global convergence of the algorithm. In recent years, many hybrid CG methods have been proposed on the basis of discrete combinations of several CG parameters (see, e.g., [1][13][20][21][22][23]). The combination parameter is computed by some secant equations [13,20], by the conjugacy condition [21,22], or by minimizing a least-squares problem involving the unknown search direction and an existing one (see [23] and the references therein).
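The two hybrid rules are simple selections between classical parameters. The sketch below assumes the usual statements of these rules, namely $\beta_k^{TS} = \beta_k^{PRP}$ if $0 \le \beta_k^{PRP} \le \beta_k^{FR}$ and $\beta_k^{FR}$ otherwise, and $\beta_k^{hHD} = \max\{0, \min\{\beta_k^{HS}, \beta_k^{DY}\}\}$. The function names are illustrative.

```python
def beta_ts(beta_prp, beta_fr):
    """Touati-Ahmed/Storey-type hybrid: use PRP only inside [0, FR]."""
    return beta_prp if 0.0 <= beta_prp <= beta_fr else beta_fr


def beta_hhd(beta_hs, beta_dy):
    """Dai/Yuan-type hybrid: clip the smaller of HS and DY at zero."""
    return max(0.0, min(beta_hs, beta_dy))


# Hypothetical input values, chosen only to exercise each branch.
ts_inside = beta_ts(0.3, 0.5)      # PRP lies in [0, FR], so PRP is kept
ts_negative = beta_ts(-0.2, 0.5)   # PRP is negative, so FR is used
hhd_clipped = beta_hhd(-0.1, 0.4)  # the minimum is negative, so it is clipped to 0
```

Both rules return a nonnegative value whenever $\beta_k^{FR} \ge 0$ (which always holds), in line with the remark that the parameters in [17][18][19] are sign-restricted.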
In 2006, Wei et al. [24] introduced a modified PRP method, usually called the WYL method, where the corresponding parameter $\beta_k$ is given by

$$\beta_k^{WYL} = \frac{g_k^T \left( g_k - \frac{\|g_k\|}{\|g_{k-1}\|} g_{k-1} \right)}{\|g_{k-1}\|^2}.$$

Under the assumption that $d_k$ generated by the WYL method satisfies the so-called sufficient descent condition

$$g_k^T d_k \le -c \|g_k\|^2, \quad c > 0,$$

the WYL method is globally convergent under the WWP line search rule and possesses superior numerical performance. Subsequently, Dai and Wen [25] proposed two improved CG methods with the sufficient descent property. The CG parameters $\beta_k^{DPRP}$ and $\beta_k^{DHS}$ in [25] are defined in terms of a parameter $\mu > 1$. Clearly, the search direction yielded by $\beta_k^{DPRP}$ satisfies the sufficient descent condition without depending on any line search. However, the sufficient descent property associated with $\beta_k^{DHS}$ relies on the WWP line search rule.

Based on the above observations, it is interesting to design a hybrid CG method such that the CG parameter is nonnegative and the resulting search direction possesses the sufficient descent property independent of the line search technique. Motivated by the methods in [24,25] and considering that the HS method performs best among the classical CG methods, we propose a new CG parameter, denoted by $\beta_k^{hHPR}$ and given by formula (14). Interestingly, the parameter $\beta_k^{hHPR}$ is always nonnegative. To see this, let $\theta_k$ be the angle between $g_k$ and $g_{k-1}$. Then (14) yields relation (15), which further implies (16). Moreover, plugging the CG parameter $\beta_k := \beta_k^{hHPR}$ into (3), we can show that the resulting search direction possesses the sufficient descent property independent of the line search technique (see Lemma 1 below).

The rest of this paper is organized as follows. In Section 2, our algorithm framework is presented, and the sufficient descent property of the resulting search direction is discussed in detail. Section 3 is devoted to establishing the convergence of the proposed method with the WWP line search rule. In the last section, some preliminary numerical results are reported to verify the efficiency of the presented method.
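The angle argument used in the introduction to establish nonnegativity can be illustrated on the WYL parameter. The derivation below is a sketch for $\beta_k^{WYL}$ only, assuming its standard form and nonzero gradients $g_k$ and $g_{k-1}$; it is not a reproduction of relations (15) and (16). It uses $g_k^T g_{k-1} = \|g_k\| \|g_{k-1}\| \cos\theta_k$, where $\theta_k$ is the angle between $g_k$ and $g_{k-1}$.

```latex
\begin{aligned}
\beta_k^{WYL}
  &= \frac{\|g_k\|^2 - \dfrac{\|g_k\|}{\|g_{k-1}\|}\, g_k^T g_{k-1}}{\|g_{k-1}\|^2}
   = \frac{\|g_k\|^2 - \|g_k\|^2 \cos\theta_k}{\|g_{k-1}\|^2} \\
  &= \frac{\|g_k\|^2 \,(1 - \cos\theta_k)}{\|g_{k-1}\|^2} \;\ge\; 0,
  \qquad\text{and also}\qquad
  \beta_k^{WYL} \le \frac{2\,\|g_k\|^2}{\|g_{k-1}\|^2}.
\end{aligned}
```

Both bounds follow from $-1 \le \cos\theta_k \le 1$ alone, so they hold regardless of how the stepsize is chosen.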

The Algorithm
In this section, we first present the algorithm framework for solving problem (1), in which the line search rule that generates the stepsize is left unspecified. Subsequently, we analyze the sufficient descent property of the search direction. By inserting the WWP line search rule into the algorithm framework, we obtain our hybrid CG method. The following lemma shows that the direction sequence $\{d_k\}$ generated by Algorithm 1 possesses the sufficient descent property independent of any line search.

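Lemma 1. Let $\{d_k\}$ be the sequence generated by Algorithm 1. Then, for some constant $M \in (0, 1)$, the search direction $d_k$ satisfies the sufficient descent condition.

Proof. Suppose that $g_k^T d_{k-1} \ne 0$ for all $k \ge 2$. It then follows from (3), (15), and (16) that the desired inequality holds, which completes the proof. □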
For convenience, in what follows, we refer to Algorithm 1 equipped with the WWP line search rule as the hHPR CG method.
Step 2: compute the stepsize $\alpha_k$ by an appropriate line search rule.
Step 3: generate the new iterate by $x_{k+1} = x_k + \alpha_k d_k$ and compute $\beta_k := \beta_k^{hHPR}$ according to (14).
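The steps of the framework can be sketched as a generic CG loop with a pluggable CG parameter and a pluggable line search. This is a minimal sketch under stated assumptions, not the paper's implementation: formula (14) is not reproduced here, so the example plugs in the WYL parameter as a stand-in, and the names `cg_framework`, `beta_fn`, `step_fn`, and the test quadratic are hypothetical. The restart with $-g_k$ when the direction is not a descent direction mirrors the safeguard described in the numerical experiments.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cg_framework(grad, x, beta_fn, step_fn, tol=1e-8, max_iter=1000):
    """Generic nonlinear CG loop; returns the final point and iteration count."""
    g = grad(x)
    d = [-gi for gi in g]                          # first direction: -g_1
    for k in range(max_iter):
        if dot(g, g) ** 0.5 <= tol:                # stop on a small gradient norm
            return x, k
        alpha = step_fn(x, d, g)                   # stepsize from the line search
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad(x)
        beta = beta_fn(g_new, g, d)                # CG parameter (pluggable)
        d_new = [-gn + beta * di for gn, di in zip(g_new, d)]
        if dot(g_new, d_new) >= 0.0:               # safeguard: not a descent direction
            d_new = [-gn for gn in g_new]
        g, d = g_new, d_new
    return x, max_iter


def beta_wyl(g_new, g_old, d_old):
    """WYL parameter, used here only as a stand-in for formula (14)."""
    n_new = dot(g_new, g_new) ** 0.5
    n_old = dot(g_old, g_old) ** 0.5
    return (n_new ** 2 - n_new / n_old * dot(g_new, g_old)) / n_old ** 2


# Hypothetical test problem: f(x) = 0.5 * (x1^2 + 4 * x2^2), exact line search.
DIAG = [1.0, 4.0]


def grad_quad(x):
    return [a * xi for a, xi in zip(DIAG, x)]


def exact_step(x, d, g):
    return -dot(g, d) / sum(a * di * di for a, di in zip(DIAG, d))


x_final, n_iter = cg_framework(grad_quad, [1.0, 1.0], beta_wyl, exact_step)
```

Keeping `beta_fn` and `step_fn` as arguments reflects the fact that the framework does not fix the line search rule; any rule returning a positive stepsize can be substituted.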
Computational Intelligence and Neuroscience 3

Convergence
In this section, we analyze the convergence of the hHPR CG method. To this end, the following common assumptions are needed.
Assumption 1. (i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_1)\}$ is bounded. Here, $x_1$ is the given initial point. (ii) In some neighborhood $N$ of the level set $\Omega$, the objective function $f(x)$ is continuously differentiable, and its gradient $g(x)$ is Lipschitz continuous, i.e., there exists a constant $L > 0$ such that

$$\|g(x) - g(y)\| \le L \|x - y\| \quad \text{for all } x, y \in N.$$

The following lemma provides the convergence of PRP-type CG methods, which was originally introduced in [19].

Lemma 2. Consider the general CG method (2) and (3) with the following three properties:
(i) The CG parameter is always nonnegative, i.e., $\beta_k \ge 0$ for all $k \ge 1$.
(ii) The line search satisfies (5) and (6) and the sufficient descent condition.
(iii) Property (*) holds.
Then,

$$\liminf_{k \to \infty} \|g_k\| = 0.$$

Property (*). Consider a method of the form (2) and (3). Suppose that

$$0 < c \le \|g_k\| \le \bar{c}. \tag{22}$$

We say that the method has Property (*) if, for all $k \ge 1$, there exist constants $b > 1$ and $\lambda > 0$ such that $|\beta_k| \le b$, and if $\|s_{k-1}\| \le \lambda$, then $|\beta_k| \le 1/(2b)$, where $s_{k-1} := x_k - x_{k-1}$.

From (16) and Lemmas 1 and 2, to obtain the global convergence of the hHPR CG method, we only need to prove that our method has Property (*).

Proof. Considering the method of the form (2) and (3) and using the constants $c$ and $\bar{c}$ in (22), we have from (16) that

Numerical Experiments
In this section, we verify the efficiency and robustness of the hHPR CG method (hHPR for short) by solving some classical test problems and compare it with two well-known CG methods: DHS and DPRP in [25].
Some of the test problems are from the well-known CUTE library in [26], and the others come from [27]. Their dimensions range from 2 to 1000000. All codes were written in MATLAB R2016a, and the numerical experiments were conducted on a Dell PC with an Intel Core CPU at 3.00 GHz and 16.00 GB of RAM. For all the aforementioned methods, we reset the search direction by taking $d_k := -g_k$ once an ascent direction occurs. For the sake of fairness, all the stepsizes $\alpha_k$ are generated by the WWP line search rule following a bisection algorithm proposed in [28], and the corresponding parameters are set to $\delta = 0.01$ and $\sigma = 0.1$. Moreover, we adopt the strategy described in [29] to compute the initial stepsize.
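A bisection-type search for a stepsize satisfying the WWP conditions (5) and (6) can be sketched as follows. This is a generic bracketing scheme written under stated assumptions and is not claimed to match the algorithm of [28] or the initial stepsize strategy of [29]. The name `wwp_bisection` and the one-dimensional test function are hypothetical. Here `phi(a)` stands for $f(x_k + a d_k)$ and `dphi(a)` for $g(x_k + a d_k)^T d_k$, and the direction is assumed to be a descent direction, i.e., `dphi(0) < 0`.

```python
def wwp_bisection(phi, dphi, delta=0.01, sigma=0.1, alpha=1.0, max_iter=60):
    """Search for a stepsize satisfying the weak Wolfe-Powell conditions.

    Returns the last trial stepsize if max_iter is reached without success.
    """
    lo, hi = 0.0, float("inf")                   # current bracket [lo, hi]
    phi0, dphi0 = phi(0.0), dphi(0.0)
    for _ in range(max_iter):
        if phi(alpha) > phi0 + delta * alpha * dphi0:
            hi = alpha                           # (5) fails: step is too long
        elif dphi(alpha) < sigma * dphi0:
            lo = alpha                           # (6) fails: step is too short
        else:
            return alpha                         # both conditions hold
        # Bisect once an upper bound is known; otherwise keep expanding.
        alpha = 0.5 * (lo + hi) if hi < float("inf") else 2.0 * lo
    return alpha


# Hypothetical one-dimensional model: phi(a) = (a - 3)^2, so dphi(0) = -6 < 0.
def phi_demo(a):
    return (a - 3.0) ** 2


def dphi_demo(a):
    return 2.0 * (a - 3.0)


alpha_star = wwp_bisection(phi_demo, dphi_demo)
```

The defaults `delta=0.01` and `sigma=0.1` match the parameter values reported for the experiments. The starting value `alpha=1.0` is an arbitrary choice for the sketch.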
The numerical results are listed in Tables 1 and 2, where "TP" denotes the test problems used in the numerical experiments and "Dim" stands for the dimension of the test problems.
The performance profile introduced in [30] is a widely used tool for measuring the performance of numerical algorithms. Figures 1 and 2 plot the performance profiles of hHPR, DHS, and DPRP in terms of Itr and Tcpu, respectively. On the left side of Figures 1 and 2, the curve of the proposed method lies clearly above the other two curves, which in turn shows that, compared with DHS and DPRP, our proposed method is efficient and encouraging. On the other hand, based on the right side of Figures 1 and 2, our proposed method can successfully solve about 90% of the test problems and clearly outperforms the other two methods.
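For readers unfamiliar with performance profiles, the computation behind such curves can be sketched as follows, assuming the usual definition: for each problem, the cost of a solver is divided by the best cost over all solvers, and the profile value at $\tau$ is the fraction of problems whose ratio is at most $\tau$. The function name `performance_profile` is illustrative, and the cost data below are made up purely to exercise the code; they are not results from the paper.

```python
import math


def performance_profile(costs, taus):
    """Compute performance profile values.

    costs[s][p] is the positive cost (e.g. iterations or CPU time) of
    solver s on problem p; math.inf marks a failure.
    """
    solvers = list(costs)
    n_prob = len(costs[solvers[0]])
    # Best cost per problem over all solvers.
    best = [min(costs[s][p] for s in solvers) for p in range(n_prob)]
    profile = {}
    for s in solvers:
        ratios = [
            costs[s][p] / best[p] if best[p] < math.inf else math.inf
            for p in range(n_prob)
        ]
        # Fraction of problems solved within a factor tau of the best solver.
        profile[s] = [sum(r <= tau for r in ratios) / n_prob for tau in taus]
    return profile


# Made-up costs for two hypothetical solvers on three problems.
demo_costs = {
    "solver_a": [10.0, 20.0, math.inf],
    "solver_b": [20.0, 10.0, 30.0],
}
demo_profile = performance_profile(demo_costs, taus=[1.0, 2.0])
```

The value at $\tau = 1$ is the fraction of problems on which a solver is the fastest, and the limit for large $\tau$ is the fraction of problems it solves at all. These correspond to the left and right sides of a profile plot.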

Data Availability
All the datasets used in this paper are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.